
187 research outputs found

Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

Author: Uçar Bora
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/08/2014

Book of Abstracts of CSC14, edited by Bora Uçar. International audience. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21 to 23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. All three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Combinatorial problems in solving linear systems

Author: Duff Iain
Uçar Bora
Publication venue: HAL CCSD
Publication date: 12/04/2011

42 pages, available as LIP research report RR-2009-15. Numerical linear algebra and combinatorial optimization are vast subjects, as is their interaction. In virtually all cases there should be a notion of sparsity for a combinatorial problem to arise. Sparse matrices therefore form the basis of the interaction of these two seemingly disparate subjects. As the core of many of today's numerical linear algebra computations consists of the solution of sparse linear systems by direct or iterative methods, we survey some combinatorial problems, ideas, and algorithms relating to these computations. On the direct methods side, we discuss issues such as matrix ordering; bipartite matching and matrix scaling for better pivoting; task assignment and scheduling for parallel multifrontal solvers. On the iterative methods side, we discuss preconditioning techniques including incomplete factorization preconditioners, support graph preconditioners, and algebraic multigrid. In a separate part, we discuss the block triangular form of sparse matrices.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

High Performance Parallel Algorithms for the Tucker Decomposition of Sparse Tensors

Author: Kaya Oguz
Uçar Bora
Publication venue: HAL CCSD
Publication date: 01/10/2015

International audience. We investigate an efficient parallelization of a class of algorithms for the well-known Tucker decomposition of general N-dimensional sparse tensors. The targeted algorithms are iterative and use the alternating least squares method. At each iteration, for each dimension of an N-dimensional input tensor, the following operations are performed: (i) the tensor is multiplied with (N − 1) matrices (TTMc step); (ii) the product is then converted to a matrix; and (iii) a few leading left singular vectors of the resulting matrix are computed (TRSVD step) to update one of the matrices for the next TTMc step. We propose an efficient parallelization of these algorithms for the current parallel platforms with multicore nodes. We discuss a set of preprocessing steps which takes all computational decisions out of the main iteration of the algorithm and provides an intuitive shared-memory parallelism for the TTM and TRSVD steps. We propose a coarse and a fine-grain parallel algorithm in a distributed memory environment, investigate data dependencies, and identify efficient communication schemes. We demonstrate how the computation of singular vectors in the TRSVD step can be carried out efficiently following the TTMc step. Finally, we develop a hybrid MPI-OpenMP implementation of the overall algorithm and report scalability results on up to 4096 cores on 256 nodes of an IBM BlueGene/Q supercomputer.
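The three per-dimension operations listed in this abstract (TTMc, matricization, truncated SVD) can be illustrated with a small sequential sketch. This is not the authors' parallel sparse implementation: it assumes a dense NumPy array, a random orthonormal initialization, and a full SVD in place of a truncated one, and the names `unfold`, `ttmc`, `reconstruct`, and `hooi` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Step (ii): matricize so that rows are indexed by `mode`."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def ttmc(tensor, factors, skip):
    """Step (i): multiply the tensor by the transpose of every factor except `skip`."""
    out = tensor
    for mode, U in enumerate(factors):
        if mode != skip:
            # Contract axis `mode` with the rows of U, then restore the axis order.
            out = np.moveaxis(np.tensordot(out, U, axes=(mode, 0)), -1, mode)
    return out

def reconstruct(core, factors):
    """Expand a core tensor back to full size with the factor matrices."""
    out = core
    for mode, U in enumerate(factors):
        out = np.moveaxis(np.tensordot(out, U, axes=(mode, 1)), -1, mode)
    return out

def hooi(tensor, ranks, iters=10, seed=0):
    """Alternating least squares sweeps for a Tucker decomposition (dense sketch)."""
    rng = np.random.default_rng(seed)
    # Assumption: random orthonormal starting factors; other initializations exist.
    factors = [np.linalg.qr(rng.standard_normal((n, r)))[0]
               for n, r in zip(tensor.shape, ranks)]
    for _ in range(iters):
        for mode in range(tensor.ndim):
            Y = unfold(ttmc(tensor, factors, mode), mode)      # steps (i) and (ii)
            U, _, _ = np.linalg.svd(Y, full_matrices=False)    # step (iii)
            factors[mode] = U[:, :ranks[mode]]                 # keep leading vectors
    core = ttmc(tensor, factors, skip=None)                    # contract every mode
    return core, factors
```

Calling `hooi(X, (2, 2, 2))` on a three-dimensional array `X` returns a 2×2×2 core and one orthonormal factor matrix per dimension. A sparse, parallel code would replace the dense `tensordot` calls with nonzero-based TTMc kernels, which is where the preprocessing and communication questions raised in the abstract arise.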

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Parallel sparse matrix-vector multiplies and iterative solvers

Author: Uçar Bora
Publication venue: Bilkent University
Publication date: 01/01/2005

Cataloged from PDF version of article. Sparse matrix-vector multiply (SpMxV) operations form the kernel of many scientific computing applications. Therefore, efficient parallelization of SpMxV operations is of prime importance to the scientific computing community. Previous works on parallelizing SpMxV operations consider maintaining the load balance among processors and minimizing the total message volume. We show that the total message latency (start-up time) may be more important than the total message volume. We also stress that the maximum message volume and latency handled by a single processor are important communication cost metrics that should be minimized. We propose hypergraph models and hypergraph partitioning methods to minimize these four communication cost metrics in one-dimensional and two-dimensional partitioning of sparse matrices. Iterative methods used for solving linear systems appear to be the most common context in which SpMxV operations arise. Usually, these iterative methods apply a technique called preconditioning. Approximate inverse preconditioning, which can be applied to a large class of unsymmetric and symmetric matrices, replaces an SpMxV operation by a series of SpMxV operations. That is, a single SpMxV operation is only a piece of a larger computation in the iterative methods that use approximate inverse preconditioning. In these methods, there are interactions in the form of dependencies between the successive SpMxV operations. These interactions necessitate partitioning the matrices simultaneously in order to parallelize a full step of the subject class of iterative methods efficiently. We show that the simultaneous partitioning requirement gives rise to various matrix partitioning models depending on the iterative method used. We list the partitioning models for a number of widely used iterative methods.
We propose operations to build a composite hypergraph by combining the previously proposed hypergraph models and show that partitioning the composite hypergraph models addresses the simultaneous matrix partitioning problem. We strove to demonstrate how the proposed partitioning methods (both the one that addresses multiple communication cost metrics and the one that addresses the simultaneous partitioning problem) help in practice. We implemented a library and investigated the performance of the partitioning methods. These practical investigations revealed a problem that we call the message ordering problem. The problem asks how to organize the send operations to minimize the completion time of a certain class of parallel programs. We show how to solve the message ordering problem optimally under reasonable assumptions. Uçar, Bora. Ph.D.
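The four communication cost metrics named in this abstract (total volume, total message count, and the per-processor maxima of each) can be made concrete with a small sketch. It assumes a square matrix, a one-dimensional rowwise partition in which the processor owning row k also owns x[k], and that only send operations are counted; the function name `comm_metrics` and its inputs are hypothetical and do not come from the thesis.

```python
from collections import defaultdict

def comm_metrics(nonzeros, part):
    """Communication metrics for row-parallel y = A x.

    nonzeros: iterable of (i, j) positions of the nonzeros of a square matrix.
    part:     part[k] is the processor owning row k and vector entry x[k]
              (an assumption of this sketch).
    """
    # needs[j] = processors holding a nonzero in column j, which therefore need x[j].
    needs = defaultdict(set)
    for i, j in nonzeros:
        needs[j].add(part[i])

    send_volume = defaultdict(int)   # words sent by each processor
    send_peers = defaultdict(set)    # distinct destinations of each processor
    for j, procs in needs.items():
        owner = part[j]
        for p in procs - {owner}:    # the owner already has x[j] locally
            send_volume[owner] += 1
            send_peers[owner].add(p)

    return {
        "total_volume": sum(send_volume.values()),
        "total_messages": sum(len(s) for s in send_peers.values()),
        "max_volume": max(send_volume.values(), default=0),
        "max_messages": max((len(s) for s in send_peers.values()), default=0),
    }
```

Two partitions with the same total volume can differ in the other three numbers, which is the reason the abstract gives for treating them as separate objectives.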

Bilkent University Institutional Repository

Partitioning sparse rectangular matrices for parallel computing of AAtX

Author: Uçar Bora
Publication venue: Bilkent University
Publication date: 08/01/2016

Ankara: Department of Computer Engineering and Information Science and The Institute of Engineering and Science of Bilkent University, 1999. Thesis (Master's) -- Bilkent University, 1999. Includes bibliographical references. Uçar, Bora. M.S.

Bilkent University Institutional Repository

Two approximation algorithms for bipartite matching on multicore architectures

Author: Dufossé Fanny
Kaya Kamer
Uçar Bora
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

International audience. We propose two heuristics for the bipartite matching problem that are amenable to shared-memory parallelization. The first heuristic is very intriguing from a parallelization perspective. It has no significant algorithmic synchronization overhead and no conflict resolution is needed across threads. We show that this heuristic has an approximation ratio of around 0.632 under some common conditions. The second heuristic is designed to obtain a larger matching by employing the well-known Karp-Sipser heuristic on a judiciously chosen subgraph of the original graph. We show that the Karp-Sipser heuristic always finds a maximum cardinality matching in the chosen subgraph. Although the Karp-Sipser heuristic is hard to parallelize for general graphs, we exploit the structure of the selected subgraphs to propose a specialized implementation which demonstrates very good scalability. We prove that this second heuristic has an approximation guarantee of around 0.866 under the same conditions as in the first algorithm. We discuss parallel implementations of the proposed heuristics on a multicore architecture. Experimental results, for demonstrating speed-ups and verifying the theoretical results in practice, are provided.
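For readers unfamiliar with the Karp-Sipser heuristic mentioned in this abstract, the following is a minimal sequential sketch of the classical rule: match a degree-one vertex with its only neighbour whenever one exists, and otherwise match a uniformly random edge. It does not include the subgraph selection or the parallel implementation proposed by the authors, and the function name `karp_sipser` and its input format are assumptions made for illustration.

```python
import random

def karp_sipser(rows, seed=0):
    """rows: dict mapping each row vertex to the set of its column neighbours.

    Returns a dict {row: column} describing the matching found.
    """
    rng = random.Random(seed)
    row_adj = {r: set(cs) for r, cs in rows.items()}   # work on a copy
    col_adj = {}
    for r, cs in row_adj.items():
        for c in cs:
            col_adj.setdefault(c, set()).add(r)
    matching = {}

    def match(r, c):
        # Record the edge, then delete both endpoints from the graph.
        matching[r] = c
        for c2 in row_adj.pop(r):
            col_adj[c2].discard(r)
        for r2 in col_adj.pop(c):
            row_adj[r2].discard(c)

    while True:
        r1 = next((r for r, cs in row_adj.items() if len(cs) == 1), None)
        c1 = next((c for c, rs in col_adj.items() if len(rs) == 1), None)
        if r1 is not None:                       # degree-one row: forced choice
            match(r1, next(iter(row_adj[r1])))
        elif c1 is not None:                     # degree-one column: forced choice
            match(next(iter(col_adj[c1])), c1)
        else:                                    # no degree-one vertex: random edge
            edges = sorted((r, c) for r, cs in row_adj.items() for c in cs)
            if not edges:
                return matching
            match(*rng.choice(edges))
```

This version rescans the graph at every step and is therefore quadratic in the worst case; practical implementations maintain a queue of degree-one vertices instead.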

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Comments on the hierarchically structured bin packing problem

Author: Lambert Thomas
Marchal Loris
Uçar Bora
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

International audience. We study the hierarchically structured bin packing problem. In this problem, the items to be packed into bins are at the leaves of a tree. The objective of the packing is to minimize the total number of bins into which the descendants of an internal node are packed, summed over all internal nodes. We investigate an existing algorithm and make a correction to the analysis of its approximation ratio. Further results regarding the structure of an optimal solution and a strengthened inapproximability result are given.
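The objective described in this abstract can be evaluated for a given packing with a short sketch. It only computes the cost (the number of distinct bins below each internal node, summed over all internal nodes) and does not check bin capacities or search for a good packing; the function name `packing_cost` and the dictionary-based tree representation are assumptions made for illustration.

```python
def packing_cost(children, bin_of):
    """Sum, over internal nodes, of the number of distinct bins used below each node.

    children: dict mapping each internal node to a list of its children.
    bin_of:   dict mapping each leaf (item) to the bin it is packed into.
    """
    cost = 0

    def bins_below(node):
        nonlocal cost
        if node not in children:             # leaf: it occupies exactly one bin
            return {bin_of[node]}
        used = set()
        for child in children[node]:
            used |= bins_below(child)
        cost += len(used)                    # contribution of this internal node
        return used

    # Roots are internal nodes that never appear as a child.
    roots = set(children) - {c for cs in children.values() for c in cs}
    for root in roots:
        bins_below(root)
    return cost
```

For a root with two internal children of two leaves each, keeping siblings in the same bin costs 1 + 1 + 2 = 4, while splitting every sibling pair across the two bins costs 2 + 2 + 2 = 6.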

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

On partitioning problems with complex objectives

Author: Kaya Kamer
Rouet François-Henry
Uçar Bora
Publication venue: HAL CCSD
Publication date: 01/01/2011

Hypergraph and graph partitioning tools are used to partition work for efficient parallelization of many sparse matrix computations. Most of the time, the objective function that is reduced by these tools relates to reducing the communication requirements, and the balancing constraints satisfied by these tools relate to balancing the work or memory requirements. Sometimes, the objective sought for having balance is a complex function of the partition. We describe some important classes of parallel sparse matrix computations that have such balance objectives. For these cases, the current state-of-the-art partitioning tools fall short of being adequate. To the best of our knowledge, there is only a single algorithmic framework in the literature to address such balance objectives. We propose another algorithmic framework to tackle complex objectives and experimentally investigate the proposed framework.

HAL-ENS-LYON

CiteSeerX

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

Hal-Diderot

On optimal tree traversals for sparse matrix factorization

Author: Jacquelin Mathias
Marchal Loris
Robert Yves
Uçar Bora
Publication venue: HAL CCSD
Publication date: 05/11/2013

12 pages. We study the complexity of traversing tree-shaped workflows whose tasks require large I/O files. Such workflows typically arise in the multifrontal method of sparse matrix factorization. We target a classical two-level memory system, where the main memory is faster but smaller than the secondary memory. A task in the workflow can be processed if all its predecessors have been processed, and if its input and output files fit in the currently available main memory. The amount of available memory at a given time depends upon the ordering in which the tasks are executed. What is the minimum amount of main memory, over all postorder schemes, or over all possible traversals, that is needed for an in-core execution? We establish several complexity results that answer these questions. We propose a new, polynomial time, exact algorithm which runs faster than a reference algorithm. Next, we address the setting where the required memory renders a pure in-core solution unfeasible. In this setting, we ask the following question: what is the minimum amount of I/O that must be performed between the main memory and the secondary memory? We show that this latter problem is NP-hard, and propose efficient heuristics. All algorithms and heuristics are thoroughly evaluated on assembly trees arising in the context of sparse matrix factorizations.
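The postorder part of the memory question posed in this abstract can be sketched under a simplified model, which is an assumption of this example rather than the paper's exact formulation: each task v produces an output file of size `f[v]`, needs `n[v]` extra memory while it runs, and can run only when the outputs of all its children are in memory. The sketch uses the classical rule attributed to Liu, visiting first the child subtrees with the largest difference between peak memory and output size. It does not cover non-postorder traversals or the out-of-core I/O problem, and `best_postorder_peak` is a hypothetical name.

```python
def best_postorder_peak(children, f, n, root):
    """Minimum peak memory over all postorder traversals (simplified model).

    children: dict mapping a node to the list of its children (leaves may be absent).
    f[v]:     size of the output file produced by node v.
    n[v]:     extra memory needed while node v executes.
    """
    def peak(v):
        kids = [(peak(c), f[c]) for c in children.get(v, [])]
        # Visit first the subtrees with the largest (peak - output size).
        kids.sort(key=lambda pf: pf[0] - pf[1], reverse=True)
        held = 0     # outputs of child subtrees already completed
        worst = 0    # largest memory use seen while processing the children
        for p, out in kids:
            worst = max(worst, held + p)
            held += out
        # Executing v needs all child outputs, its workspace, and its own output.
        return max(worst, held + n[v] + f[v])

    return peak(root)
```

With two leaf children whose peaks are 11 and 5 and whose outputs are 1 and 5, processing the first child first gives a peak of 11, whereas the opposite order would need 16.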

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Investigations on push-relabel based algorithms for the maximum transversal problem

Author: Kaya Kamer
Langguth Johannes
Manne Fredrik
Uçar Bora
Publication venue: HAL CCSD
Publication date: 01/01/2012

We investigate the push-relabel algorithm for solving the problem of finding a maximum cardinality matching in a bipartite graph in the context of the maximum transversal problem. We describe in detail an optimized yet easy-to-implement version of the algorithm and fine-tune its parameters. We also introduce new performance-enhancing techniques. On a wide range of real-world instances, we compare the push-relabel algorithm with state-of-the-art augmenting path-based algorithms and the recently proposed pseudoflow approach. We conclude that a carefully tuned push-relabel algorithm is competitive with all known augmenting path-based algorithms, and superior to the pseudoflow-based ones.

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot